Maintaining Consistency of Probabilistic Databases: A Linear Programming Approach

Authors

  • You Wu
  • Wilfred Ng
Abstract

The problem of maintaining consistency via functional dependencies (FDs) has been studied extensively in traditional database settings, and many probabilistic data models have been proposed in past decades. However, how to maintain consistency in probabilistic relations via FDs remains unclear. In this paper, we clarify the concept of FDs in probabilistic relations and present an efficient chase algorithm, LPChase(r,F), for maintaining consistency of a probabilistic relation r with respect to an FD set F. LPChase(r,F) adopts a novel approach that uses a Linear Programming (LP) method to modify the probabilities of data values in r. Our approach has several benefits. First, LPChase(r,F) guarantees that its output is always the minimal change to r. Second, assuming that the expected size of an active domain consisting of data values with non-zero probability is fixed, we demonstrate the interesting result that the LP solving time in LPChase(r,F) decreases as the probabilistic data domains grow and becomes negligible for large domain sizes. The I/O time and modeling time, by contrast, remain stable even as the domain size increases.
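
The abstract does not spell out the LP itself, so the following is only a toy sketch of the general idea of a minimal-change repair, not LPChase(r,F). It assumes (hypothetically) that two tuples certainly agree on the left-hand side of an FD X → Y, that consistency requires their Y-value distributions p and q to coincide, and that "change" is measured as the L1 distance. Under those assumptions the LP "minimize Σ_v (|d_v − p_v| + |d_v − q_v|) subject to Σ_v d_v = 1, d_v ≥ 0" has a closed-form optimum: each term is at least |p_v − q_v|, and the midpoint d = (p + q)/2 attains that bound while still summing to 1. The function name `merge_min_l1` and the sample distributions are made up for illustration.

```python
from fractions import Fraction


def merge_min_l1(p, q):
    """Return (d, cost): a common distribution d and its total L1 change.

    p and q map data values to probabilities (each summing to 1).
    d is the midpoint of p and q, which is one optimal solution of the
    toy LP described above (an assumption, not the paper's formulation).
    """
    zero = Fraction(0)
    values = sorted(set(p) | set(q))
    # Midpoint lies between p_v and q_v for every value v, so each
    # per-value cost |d_v - p_v| + |d_v - q_v| equals |p_v - q_v|.
    d = {v: (p.get(v, zero) + q.get(v, zero)) / 2 for v in values}
    cost = sum(
        abs(d[v] - p.get(v, zero)) + abs(d[v] - q.get(v, zero))
        for v in values
    )
    return d, cost


# Hypothetical Y-value distributions of two tuples that agree on X.
p = {"a": Fraction(7, 10), "b": Fraction(3, 10)}
q = {"a": Fraction(1, 2), "c": Fraction(1, 2)}

d, cost = merge_min_l1(p, q)

# Lower bound on the LP objective: the L1 distance between p and q.
lower_bound = sum(
    abs(p.get(v, Fraction(0)) - q.get(v, Fraction(0)))
    for v in set(p) | set(q)
)

for v in sorted(d):
    print(v, d[v])
print("total change:", cost, "lower bound:", lower_bound)
```

Exact fractions are used so that the check that the cost meets the lower bound is not blurred by floating-point rounding. A general FD set would couple many tuples at once, which is where an actual LP solver would be needed.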


Similar resources

Multi-choice stochastic bi-level programming problem in cooperative nature via fuzzy programming approach

In this paper, a Multi-Choice Stochastic Bi-Level Programming Problem (MCSBLPP) is considered in which all the constraint parameters follow a normal distribution. The cost coefficients of the objective functions are of multi-choice type. First, all the probabilistic constraints are transformed into deterministic constraints using a stochastic programming approach. Further, a general tran...


Probabilistic Consistency Boosts MAC and SAC

Constraint Satisfaction Problems (CSPs) are ubiquitous in Artificial Intelligence. The backtrack algorithms that maintain some local consistency during search have become the de facto standard to solve CSPs. Maintaining higher levels of consistency, generally, reduces the search effort. However, due to ineffective constraint propagation, it often penalises the search algorithm in terms of time....


A Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis

Medical diagnosis is among the most influential components of treatment policy. Recently, significant advances have been made in medical diagnosis using data mining techniques. Data mining, or Knowledge Discovery, searches large databases to discover patterns and evaluate the probability of future occurrences. In this paper, Bayesian Classifier is used as a Non-linear dat...


Stochastic Approach to Vehicle Routing Problem: Development and Theories

In this article, a chance constrained (CCP) formulation of the Vehicle Routing Problem (VRP) is proposed. In practice, converting certain special forms of probabilistic constraints into their equivalent deterministic form produces a nonlinear constraint. Knowing that reliable computer software...



Publication date: 2010